Corpus: bos_wikipedia_2021_30K


4.4.1.5 Number of Word-N-grams at Sentence Endings

Number of distinct word N-grams (N = 1 to 5) at sentence endings, for the first K sentences

K         # of words   # of bigrams   # of trigrams   # of 4-grams   # of 5-grams
100               96             98              98             98             98
1000             900            992             997            998            998
10000           6953           9561            9896           9964           9989
100000         16925          27560           29465          29850          29944
1000000        16925          27560           29465          29850          29944
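
The page does not say how these counts were produced. The following is a minimal sketch of one way to compute such a table, under several assumptions: sentences are tokenized on whitespace, the N-gram at a sentence ending is the last N tokens of the sentence, sentences shorter than N contribute nothing for that N, and the counts are of distinct N-grams. The function name `ending_ngram_counts` and its parameters are hypothetical and not part of the original tooling.

```python
def ending_ngram_counts(sentences, ks, max_n=5):
    """Count distinct sentence-final word N-grams for the first K sentences.

    Returns a dict mapping each K in `ks` to a list of max_n counts
    (index 0 = single words, index 1 = bigrams, and so on).
    Assumes every K is a positive integer.
    """
    pending = sorted(ks)
    seen = [set() for _ in range(max_n)]  # one set of N-grams per N
    result = {}

    for count, sentence in enumerate(sentences, start=1):
        tokens = sentence.split()  # assumption: whitespace tokenization
        for n in range(1, max_n + 1):
            if len(tokens) >= n:  # assumption: skip sentences shorter than N
                seen[n - 1].add(tuple(tokens[-n:]))
        # Record a snapshot each time a requested K is reached.
        while pending and pending[0] <= count:
            result[pending.pop(0)] = [len(s) for s in seen]

    # Any K larger than the number of sentences reports the totals.
    for k in pending:
        result[k] = [len(s) for s in seen]
    return result


if __name__ == "__main__":
    demo = ["a b c", "x b c", "a b d", "c"]
    for k, row in ending_ngram_counts(demo, [2, 4, 10]).items():
        print(k, *row)
```

In this sketch, a K larger than the number of available sentences simply repeats the totals for the whole input. That behaviour would be consistent with the identical rows for K = 100000 and K = 1000000 above, although the page itself does not state the reason.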


Zipf's diagram for sentence endings


[Figure: Gnuplot diagram]
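
The diagram itself is not reproduced here, and the page does not state exactly what it plots. Assuming it is the usual Zipf rank-frequency view of sentence endings, the data points could be derived roughly as sketched below. The function name `zipf_points`, the whitespace tokenization, and the choice of the last N tokens as the "ending" are all assumptions made for illustration.

```python
from collections import Counter


def zipf_points(sentences, n=1):
    """Return (rank, frequency) pairs for sentence-final word N-grams.

    Rank 1 is the most frequent ending. Plotting these pairs on
    log-log axes gives a Zipf-style diagram.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()  # assumption: whitespace tokenization
        if len(tokens) >= n:
            counts[tuple(tokens[-n:])] += 1
    freqs = sorted(counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))


if __name__ == "__main__":
    # One "rank frequency" pair per line, a layout Gnuplot can read directly.
    for rank, freq in zipf_points(["a b .", "c d .", "e f !", "g"]):
        print(rank, freq)
```

Writing the pairs one per line keeps the output usable as a plain data file for a plotting tool; the axis scaling is left to the plotting step.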

3056 msec needed at 2021-06-18 04:03